arXiv:1508.07891v4 [q-fin.TR] 14 Nov 2016 


A REDUCED-FORM MODEL FOR LEVEL-1 LIMIT ORDER BOOKS

TZU-WEI YANG AND LINGJIONG ZHU 


Abstract. One popular approach to modeling the dynamics of the best bid and
ask at level-1 of the limit order book is to use reduced-form diffusion
approximations. It is well known that the biggest contributing factor to the price
movement is the imbalance of the best bid and ask. We investigate the data
of the level-1 limit order books of a basket of stocks and study the numerical
evidence of drift, correlation, volatility and their dependence on the imbalance.
Based on the numerical discoveries, we develop a nonparametric discrete
model for the dynamics of the best bid and ask, which can be approximated
by a reduced-form model that incorporates the empirical correlation and
volatilities, remains analytically tractable, and fits the empirical probability
of price movement.


1. Introduction 

Traditional human traders have largely been replaced by automatic and
electronic traders in today’s financial world. The role of high frequency
traders, and the controversy around them, has caught the public’s attention ever
since the infamous flash crash on May 6, 2010. The long-standing debate over the
fairness of equity markets briefly became salient with the publication of the book
“Flash Boys” by Michael Lewis, who argued that trading has become unfair and skewed
by high-frequency trading, which has dampened the opportunities of regular investors.

In automatic and electronic order-driven trading platforms, orders arrive at the 
exchange and wait in the limit order book. There are two types of orders in the 
limit order book: market orders and limit orders. Cancellation is also allowed. One 
of the key research areas in limit order books has been centered around modeling 
the limit order book dynamics. In this paper, we only consider the limit order book 
model at level-1, that is, we only study the dynamics of the volumes at the best 
bid and the best ask. 

The limit order book is a discrete queuing system and many of the works in the
literature study the dynamics of the limit order book in a discrete setting
directly, see e.g. Cont et al. [5], Abergel and Jedidi [1]. Another popular approach
is to study the reduced form of the discrete model. In the sense of heavy traffic
limits, various authors, see e.g. Cont and de Larrard [21 m, Avellaneda et al. [2], 
Guo et al. [8] considered the diffusion limit as an approximation of the discrete
model. The diffusion approximation is valid if the average queue sizes are much 
larger than the typical quantity of shares traded and the frequency of orders per 
unit time is high, see e.g. the discussions in Avellaneda et al. [2]. With the
empirical finding of approximate scale invariance, Bouchaud et al. [7] derive a
two-dimensional Fokker-Planck equation describing the statistical behavior of the
queue dynamic. In the recent work by Huang et al. cni, they introduced a model 
which accommodates the empirical properties of the full order book and the stylized 
facts of lower frequency financial data. In their model, the order flows have state- 
dependent intensities. We refer to the recent book [12] for more details. 

One research area of great interest is the dynamics of the limit order books and
how they influence the stock price movement, see e.g. Avellaneda et al. [2], Cont et
al. 13, Huang and Kercheval jS] etc. There is strong empirical evidence to suggest 
that the biggest factor driving the movement of the stock price to the next level is
the imbalance of the best bid and the best ask, see e.g. Avellaneda et al. [2], which
is defined as the ratio of the volume at the best bid to the total volume at the best
bid and ask:


(1.1)    $\text{Imbalance} = \dfrac{\text{Volume at Best Bid}}{\text{Volume at Best Bid} + \text{Volume at Best Ask}}.$


In the limit order book, the stock price will move up when the best ask queue
is depleted and the price will move down when the best bid queue is depleted.
The empirical data suggests that the probability that the stock price will move up
increases as the imbalance increases. One may expect this probability to increase
monotonically from 0 to 1 as the imbalance increases from 0 to 1. But empirical
evidence suggests otherwise. Avellaneda et al. [2] discovered that even though the
probability of the stock price moving up is indeed an increasing function of the
imbalance, it increases from a positive value to a value less than one. One
explanation is hidden liquidity, that is, the sizes that are not shown in the limit
order book, see [2]. As hypothesized in [2], there can be two explanations for
hidden liquidity. First, markets are fragmented and it can happen that once the
best ask on an exchange is depleted, the price will not necessarily go up since an
ask order at that price may still be available on another market and a new bid
cannot arrive until that price is cleared on all markets. Second, there exist
so-called iceberg orders, where trading algorithms split large orders into smaller
ones that refill the best quotes as soon as they are depleted.

Indeed, we also discover the hidden liquidity in our numerical analysis and we
use the idea of the hidden liquidity to better fit the model. The numerical evidence
suggests that the empirical probability of the price moving up depends linearly on
the imbalance, with hidden liquidity at the very small and very large imbalance
levels, see Figures 10, 11, 12 and 13. In [2], correlated Brownian motions are used
as a reduced-form model to describe the dynamics of the best bid and ask queues.
The linear dependence of the empirical probability of the price moving up on the
imbalance level suggests that in the correlated Brownian motions model, the
correlation should be exactly -1. However, we carried out numerical analysis to
study the correlation and volatilities of the best bid and ask sizes and their
dependence on the imbalance, and found that the correlation is negative, but far
away from -1, and that it also depends on the level of the imbalance. Therefore,
the correlated Brownian motions might be too simplistic to explain the dynamics of
the best bid and ask queues. In this paper, we will build a non-parametric model
that can fit the empirical correlation, the empirical volatilities of the best bid
and ask sizes and the empirical probability of the price moving up simultaneously.

In this paper, we use data to study the level-1 limit order books of a basket of
stocks, to further understand how the dynamics of the best bid and ask sizes depend
on the imbalance. We also compare the results across exchanges, in particular
NASDAQ and NYSE, because our empirical data suggest that the stocks we selected
have their largest trading volumes on these two exchanges. We discover that the
microstructure and the dynamics of the limit order books depend on the exchange, in
the sense that key statistics like the correlation between the best bid and ask,
and the drift effect at the best bid and ask queues, can differ across exchanges.
The discrepancy amongst the exchanges has caught a lot of attention lately. As
noted in a recent
article on Wall Street Journal m- 

“There is no question that U.S. equity markets are fragmented.
The New York Stock Exchange’s share of trading in its listed stocks
has dropped to 32% of its volume from 77% a decade ago... This 
fragmentation... also creates arbitrage opportunities that did not 
exist when trading markets were unified.” 

In our empirical study, we find evidence of discrepancies amongst different
exchanges and also across different stocks. This has two possible implications.
First, the discrepancy can possibly be explained by the different trading patterns
of different algorithmic traders. Say we have two high frequency traders A and B,
who are trading two different baskets of stocks. Then the different behavior of the
limit order books of the stocks may be due to the different trading strategies
and patterns of these two players. As we will see later from our
empirical studies, different stocks are concentrated in different exchanges. Hence 
the different trading strategies of the traders behind different stocks can result in 
the discrepancy across the different exchanges. Second, the fragmentation of the 
stock exchanges may intrinsically cause the difference of the dynamics of the limit 
order books on different exchanges, especially when the imbalance of the best bid 
and ask is either small or large, that is when the queues at the best bid and ask are 
near depletion. In these cases, new orders may be directed to a different exchange 
when liquidity is still available. Thus the fragmentation of the stock exchanges and 
the discrepancy of the limit order books dynamics across different exchanges may 
create arbitrage opportunities. As the same article on WSJ pointed out, 

“Transparency disappears behind a shroud of complex order types 
executed on vaguely sinister dark pools, trading venues that sometimes
are used to disadvantage long-term investors... The remedy
is to create multiple trading venues and then limit trading in a 
particular security to one of them.” 

The paper is organized as follows. In Section 2, we carry out the data analysis to
study the empirical evidence of the dependence of the dynamics at the best bid and
ask on the imbalance. Based on the empirical evidence, we build a nonparametric
reduced-form model in Section 3 with analytical tractability, hence extending the
existing reduced-form and diffusion approximation approach to the level-1 limit
order book dynamics in the literature. The conclusion of the paper follows in the
final section. Finally, the technical proofs are in Appendix A and the tables are
in Appendix B.

Table 1. An Example of the Raw Data. Source: Wharton Research Data Services (WRDS).

Ticker  Date      Time      Bid    Ask    Bid Size  Ask Size  Exchange
GM      20140102  10:12:44  40.55  40.57  9         33        N
GM      20140102  10:12:44  40.55  40.57  7         33        N
GM      20140102  10:12:44  40.56  40.57  4         4         T
GM      20140102  10:12:44  40.56  40.57  4         6         T
GM      20140102  10:12:44  40.56  40.57  1         2         P
GM      20140102  10:12:44  40.56  40.57  3         6         T


2. Data Analysis 

For our data analysis, we use the consolidated quotes of the NYSE-TAQ data set
from the Wharton Research Data Services (WRDS). We look at the level-1 data,
that is, the best bid price, best ask price, best bid size and best ask size for a
basket of stocks traded on different exchanges. The time window of the data set
we selected is the first five trading days of 2014, that is, January 2nd, 3rd, 6th,
7th and 8th. We only consider the data between 10am and 4pm of the
trading days. We exclude the pre-market and after-market data as well as the data
from the first half hour of each trading day, i.e., 9:30-10am, since these data are
usually quite noisy. We concentrate on the following stocks: Bank of
America (BAC), General Electric (GE), General Motors (GM) and JP Morgan &
Chase (JPM). These blue-chip stocks have large market capitalization, are highly
liquid and have large trading volumes. Moreover, the average prices of these stocks
are reasonably small and the bid-ask spreads are narrow (one tick size most of
the time), so that the data sets are not too noisy. A sample of the raw data we
use is given in Table 1 as an illustration. In Table 1, the units of the best bid
and ask prices are US Dollars and the units of the best bid and ask sizes are 100 shares.
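
As a rough illustration of definition (1.1) applied to rows like those in Table 1, the following Python sketch computes the imbalance of each quote and keeps only quotes inside the 10am to 4pm window. The tuple layout and the helper names (`imbalance`, `in_window`, `level1_imbalances`) are assumptions made for this example; they are not the file format or the code used in the study.

```python
from datetime import time

# Rows copied from Table 1:
# (ticker, date, time, bid, ask, bid_size, ask_size, exchange)
ROWS = [
    ("GM", "20140102", "10:12:44", 40.55, 40.57, 9, 33, "N"),
    ("GM", "20140102", "10:12:44", 40.55, 40.57, 7, 33, "N"),
    ("GM", "20140102", "10:12:44", 40.56, 40.57, 4, 4, "T"),
    ("GM", "20140102", "10:12:44", 40.56, 40.57, 4, 6, "T"),
    ("GM", "20140102", "10:12:44", 40.56, 40.57, 1, 2, "P"),
    ("GM", "20140102", "10:12:44", 40.56, 40.57, 3, 6, "T"),
]


def imbalance(bid_size, ask_size):
    """Imbalance as in (1.1): bid volume over total volume at the best quotes."""
    total = bid_size + ask_size
    if total <= 0:
        raise ValueError("total size at the best quotes must be positive")
    return bid_size / total


def in_window(hhmmss, start=time(10, 0), end=time(16, 0)):
    """True if an HH:MM:SS timestamp lies in the 10am-4pm window."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return start <= time(h, m, s) <= end


def level1_imbalances(rows, exchange):
    """Imbalances of all in-window quotes reported by one exchange code."""
    return [
        imbalance(row[5], row[6])
        for row in rows
        if row[7] == exchange and in_window(row[2])
    ]
```

For instance, `level1_imbalances(ROWS, "T")` uses only the three NASDAQ quotes in the sample, so each exchange gets its own imbalance series.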

Our empirical observation from the raw data shows that, most of the time, only
one of the bid and ask queue sizes changes, while the other remains the
same. Such a pattern would not be well approximated by the diffusion
process that we consider in Section 3, and the empirical correlation between the bid
and ask queue sizes is almost zero. We therefore average the consecutive data when
only one queue size varies while the other one remains the same. In this
way, the data are better approximated by a diffusion process.

We found that the largest volumes for the stocks we studied are traded on
NASDAQ (T) and NYSE (N),^1 see Figure 1. The symbols in Figure 1 stand for
the exchanges that the stocks are traded at and the details are given in Table 2.

First, we investigate the drift effect of the best bid and ask queue lengths. People 
have used both driftless diffusions, see e.g. [2] and diffusions with constant drift, 
see e.g. HIH] to model the dynamics of the best bid and ask queues. Therefore, 
we are interested in seeing from the data set if there is indeed any evidence of 


^1 That is not always the case. For example, for the first five trading days of 2014, the exchanges
that trade the largest volumes of Walmart (WMT) are NYSE (N) and BATS (Z) and the exchanges
that trade the largest volumes of Microsoft (MSFT) are NASDAQ (T) and BATS (Z). For
comparison purposes, we only study those stocks with top two exchanges being NASDAQ and
NYSE.

Table 2. Primary Listed Exchange Codes

Code  Exchange                       Code  Exchange
B     NASDAQ OMX BX                  P     NYSE Arca SM
C     National Stock Exchange        T     NASDAQ OMX
J     Direct Edge A Stock Exchange   W     CBOE Stock Exchange
K     Direct Edge X Stock Exchange   X     NASDAQ OMX PSX
M     Chicago Stock Exchange         Y     BATS Y-Exchange
N     New York Stock Exchange        Z     BATS Exchange


[Figure 1: four pie charts of the percentages of the number of orders in different
exchanges, one panel each for Bank of America, General Electric, General Motors and
JPMorgan Chase & Co.]

Figure 1. Pie Charts of Bank of America, General Electric, General
Motors, and JP Morgan & Chase


the drift in the dynamics of the best bid and ask queues. If there is any evidence of
drift, is the drift a constant, or a function depending on the queue lengths and the
imbalance of the best bid and ask? Our studies are summarized in Figure 2 for
Bank of America, Figure 3 for General Electric, Figure 4 for General Motors and
Figure 5 for JP Morgan & Chase. For example, in Figure 2 we study the total
volumes for the positive changes and the negative changes at the best bid queues, 
best ask queues for both NASDAQ and NYSE for the Bank of America stock. In 
all these plots, the purple bars stand for the total volume of the negative changes 
at a particular imbalance level and the yellow bars stand for the total volume of 
the positive changes at a particular imbalance level. The red lines denote the ratio 
of the total volume of the positive changes to the total volume of all the changes. 

We can see that when the imbalance is neither too small nor too large, there is
little evidence of drift in the best bid and ask queues.

On the other hand, when the imbalance is very small or very large, we do observe
drifts. From the top left picture in Figure 2, it is clear that there is evidence of
negative drift at the best bid queue when the imbalance is small (and hence the
queue length is short) and little evidence of drift otherwise. From the top right
picture in Figure 2, it is clear that there is evidence of negative drift at the best
ask queue when the imbalance is large (and hence the queue length is short) and
little evidence of drift otherwise. On the other hand, from the bottom two pictures
in Figure 2, we observe that exactly the opposite is true for the Bank of America
stock traded on NYSE, that is, there is positive drift at the best bid queue when the
imbalance is small and at the best ask queue when the imbalance is large, although
the drift effect is weak. One possible explanation is that there can be different
scenarios when the queue lengths are short. For example, it can happen that the
queue length is short when the traded stock is about to move to the next price level.
There is a clustering effect of market orders and cancellations of the limit orders
that can explain the negative drift we observed in the top two pictures in Figure 2.

On the other hand, it is also possible that the queue length is short because it
is a new queue and there is a clustering effect of the arrivals of new limit orders at
the new queue, which results in the positive drift we observed in the bottom two
pictures in Figure 2. Similar patterns are also observed for the General Electric
stock, see Figure 3. On the other hand, we see in Figure 4 that for the General
Motors stock, for both NASDAQ and NYSE, there is positive drift at the best bid
queue when the imbalance is small and at the best ask queue when the imbalance
is large and there is little drift otherwise. Similar patterns also hold for the JP
Morgan & Chase stock, see Figure 5.

The statistics are summarized in Table [3] and Table IH. BAC^b and BAC^a stand
for the best bid and the best ask queues for BAC, respectively. The number in each
cell is obtained by computing

$\dfrac{\text{Volume of Positive Change}}{\text{Volume of Positive Change} + \text{Volume of Negative Change}}.$

From these tables, we can see that except when the imbalance is small or
large, there is little evidence of drift in the best bid and ask queues. In terms of
modeling, this suggests that we can build a model with no drift effect when the
imbalance is neither too small nor too large. With the small and large imbalance,
instead of modeling the drift effect by adding a drift term in the dynamics, we use
the idea of the hidden liquidity from [2] to better fit the model and explain the
dynamics at the best bid and ask queues.
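
The drift statistic described above (volume of positive changes over the volume of all changes, per imbalance level) can be sketched in Python as follows. The input format, a list of `(imbalance, size_change)` pairs for one queue, and the default of 20 equal bins of width 0.05 are assumptions for this example; the bin width is only inferred from ranges such as "between 0.05 and 0.10" quoted in the text.

```python
def positive_change_ratio(observations, n_bins=20):
    """Ratio of positive-change volume to total-change volume per imbalance bin.

    observations: iterable of (imbalance, size_change) pairs for one queue,
    with imbalance in [0, 1]. Returns a list of length n_bins; an entry is
    None when no size change was observed in that bin.
    """
    positive = [0.0] * n_bins
    negative = [0.0] * n_bins
    for z, change in observations:
        if not 0.0 <= z <= 1.0:
            raise ValueError("imbalance must lie in [0, 1]")
        k = min(int(z * n_bins), n_bins - 1)  # z == 1 goes into the last bin
        if change > 0:
            positive[k] += change
        elif change < 0:
            negative[k] += -change
    return [
        p / (p + n) if p + n > 0 else None
        for p, n in zip(positive, negative)
    ]
```

A ratio near 0.5 in a bin corresponds to little evidence of drift at that imbalance level, while values well below or above 0.5 correspond to negative or positive drift.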

Next, we investigate the correlation between the best bid and ask dynamics, the
volatilities at the best bid and ask queues, and their dependence on the imbalance.
We summarize our observations for the Bank of America stock in Figure 6, the
General Electric stock in Figure 7, the General Motors stock in Figure 8 and the

[Figure 2: four bar-chart panels, Best Bid Queue and Best Ask Queue for Bank of
America at NASDAQ (top) and at NYSE (bottom); horizontal axis: Imbalance; legend:
Negative Changes, Positive Changes.]

Figure 2. Positive and Negative Changes of the Volumes at the
Best Bid and the Best Ask of Bank of America at NASDAQ and
NYSE. The curve is the ratio of the positive changes to the total
changes.


JP Morgan & Chase stock in Figure 9. For example, let us take a look at the
summary for the Bank of America stock in Figure 6. The top row in Figure 6
shows the correlation between the size changes at the best bid and the best ask
(top left), the standard deviation of size changes at the best bid (top middle), and
the standard deviation of size changes at the best ask (top right) for the Bank of
America stock traded on NASDAQ. Similar statistics for the Bank of America stock
traded on NYSE are summarized in the bottom row in Figure 6. As we can see from
the top left picture, the correlation as a function of the imbalance is a W-shaped
curve for the Bank of America stock traded on NASDAQ and, from the bottom left
picture, a U-shaped curve for the Bank of America stock traded on NYSE. A similar
pattern is also observed for the General Electric stock traded on NASDAQ and
NYSE, see Figure 7. The U-shaped curve is observed for General Motors and JP
Morgan & Chase traded on both NASDAQ and NYSE, see Figure 8 and Figure 9.
Indeed, we studied some other stocks in the WRDS database as well, and empirical
studies suggest that U-shaped curves and W-shaped curves are universal for the
correlation between the size changes at the best bid and ask for most stocks. It
also holds that the correlation in general is negative but is far away from -1. It
is curious why a typical relation of the correlation between the size changes at the
best bid and ask and the imbalance of the best bid and ask can be represented

[Figure 3: four bar-chart panels, Best Bid Queue and Best Ask Queue for General
Electric at NASDAQ and at NYSE; horizontal axis: Imbalance.]

Figure 3. Positive and Negative Changes of the Volumes at the
Best Bid and the Best Ask of General Electric at NASDAQ and
NYSE. The curve is the ratio of the positive changes to the total
changes.


by either a U-shaped curve or a W-shaped curve. It is also worth noting that
sometimes we get different shaped curves for different exchanges (Figure 6, Figure 7)
and sometimes we get the same shaped curves for different exchanges (Figure 8,
Figure 9). That can probably be explained by the fact that some high frequency
and algorithmic trading firms apply their trading strategies to a particular stock
exchange only and the different trading strategies result in the different patterns of
the best bid and ask dynamics we observed from the data. Figures 6, 7, 8 and 9 also
contain the information about the standard deviations of the size changes at the
best bid and best ask queues on NASDAQ and NYSE. The general observation is
that most of the time, the standard deviation increases as the imbalance increases
at the best bid queues and decreases as the imbalance increases at the best ask
queues. Note that the best bid size increases as the imbalance increases and the best
ask size decreases as the imbalance increases. Hence, what we observed is that the
standard deviations increase as the queue lengths increase. This is not surprising
at all. But what is interesting is that in many cases, it is not exactly monotone and
we see a sudden increase of the standard deviation when the imbalance is small for
the best bid and large for the best ask, that is, when the queue length is short.
That suggests that when the queue length is short, that is, when the queue is about
to get depleted, or when there is a new queue created, the volatilities tend to be
large. In general, the volatilities of the empirical data tend to be noisier than the

[Figure 4: four bar-chart panels, Best Bid Queue and Best Ask Queue for General
Motors at NASDAQ and at NYSE; horizontal axis: Imbalance.]

Figure 4. Positive and Negative Changes of the Volumes at the
Best Bid and the Best Ask of General Motors at NASDAQ and
NYSE. The curve is the ratio of the positive changes to the total
changes.


correlations, which follow either a U-shaped or a W-shaped curve. Nevertheless,
skewed U-shaped curves are observed quite often. For example, in the top middle
and top right pictures in Figure 6, Figure 7, Figure 8 and Figure 9, we have the
skewed U-shaped curves. For the best bid queues, it is skewed towards the left
and for the best ask queues, it is skewed towards the right. It is curious that for
the stocks traded on NASDAQ, we have these universal skewed U-shapes for the
volatilities. But the data for the NYSE tend to be noisier and the pattern is not
very clear. This once again indicates the very different natures of the level-1 limit
order dynamics across different exchanges.
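
The per-imbalance statistics discussed above (correlation of the size changes at the best bid and ask, and their standard deviations) can be computed along the following lines. This is a minimal sketch: the sample format `(imbalance, bid_change, ask_change)`, the function name `binned_stats` and the 20 equal bins are assumptions for illustration, and sample (n - 1) estimators are used since the text does not say which estimator was applied.

```python
import math


def binned_stats(samples, n_bins=20):
    """Correlation and standard deviations of size changes per imbalance bin.

    samples: iterable of (imbalance, bid_change, ask_change) triples.
    Returns a list of length n_bins whose entries are either None (fewer than
    two samples in the bin) or a tuple (correlation, std_bid, std_ask), where
    correlation is None if one of the standard deviations is zero.
    """
    bins = [[] for _ in range(n_bins)]
    for z, d_bid, d_ask in samples:
        if not 0.0 <= z <= 1.0:
            raise ValueError("imbalance must lie in [0, 1]")
        bins[min(int(z * n_bins), n_bins - 1)].append((d_bid, d_ask))

    stats = []
    for pairs in bins:
        n = len(pairs)
        if n < 2:
            stats.append(None)
            continue
        mean_b = sum(b for b, _ in pairs) / n
        mean_a = sum(a for _, a in pairs) / n
        var_b = sum((b - mean_b) ** 2 for b, _ in pairs) / (n - 1)
        var_a = sum((a - mean_a) ** 2 for _, a in pairs) / (n - 1)
        cov = sum((b - mean_b) * (a - mean_a) for b, a in pairs) / (n - 1)
        std_b, std_a = math.sqrt(var_b), math.sqrt(var_a)
        corr = cov / (std_b * std_a) if std_b > 0 and std_a > 0 else None
        stats.append((corr, std_b, std_a))
    return stats
```

Plotting the first component of each entry against the bin midpoints would give a curve of the kind described as U-shaped or W-shaped in the text.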

We summarize the statistics of the correlations in a table in Appendix B. As we
can see, the correlation is almost always negative. In terms of the numbers, the
largest correlation is 0.02, achieved by the Bank of America stock traded on NYSE
with imbalance between 0.05 and 0.10. The most negative correlation is achieved by
JP Morgan traded on NYSE, which is -0.34, that is, far away from -1. One interesting
observation is that when the imbalance is between 0.2 and 0.8, we can see from the
same table that the correlation of the stock traded on NYSE is always more negative
than the correlation of the same stock traded on NASDAQ.^2 As we mentioned earlier,
the fragmentation and discrepancy of the stock exchanges is well documented in

^2 With the exception of JP Morgan when the imbalance is between 0.55 and 0.60.

[Figure 5: four bar-chart panels, Best Bid Queue and Best Ask Queue for JPMorgan
Chase & Co. at NASDAQ and at NYSE; horizontal axis: Imbalance.]

Figure 5. Positive and Negative Changes of the Volumes at the
Best Bid and the Best Ask of JP Morgan & Chase at NASDAQ
and NYSE. The curve is the ratio of the positive changes to the
total changes.


the literature. For example, we can ask the question why the correlation of stocks 
traded on NYSE is more negative than that of NASDAQ. 

3. A Reduced-Form Model 

The simplest continuous time diffusion model to describe the dynamics of the
level-1 limit order book is the correlated Brownian motions, where $Q^b(t)$ and $Q^a(t)$
are the queue lengths at the best bid and the best ask normalized by the median
size of the queues, see e.g. Avellaneda et al. [2]:

(3.1)    $dQ^b(t) = \sigma\, dW^b(t), \quad Q^b(0) = x,$

(3.2)    $dQ^a(t) = \sigma\, dW^a(t), \quad Q^a(0) = y,$

where $W^b(t)$ and $W^a(t)$ are two correlated standard Brownian motions with
correlation $-1 < \rho < 1$.

We are interested in the probability of the price movement. The probabilities that
the price moves up and down are given respectively by

(3.3)    $P_{\mathrm{up}} = \mathbb{P}(\tau^a < \tau^b), \quad P_{\mathrm{down}} = \mathbb{P}(\tau^b < \tau^a),$

where

(3.4)    $\tau^a := \inf\{t > 0 : Q^a(t) < 0\}, \quad \tau^b := \inf\{t > 0 : Q^b(t) < 0\}.$

[Figure 6: panels for Bank of America at NASDAQ (top row) and at NYSE (bottom row),
showing the correlation and the standard deviations of the size changes at the best
bid and the best ask as functions of the imbalance.]

Figure 6. Correlations and Standard Deviations of the Volumes
at the Best Bid and the Best Ask of Bank of America at NASDAQ
and NYSE

[Figure 7: panels for General Electric at NASDAQ and at NYSE, showing the
correlation and the standard deviations of the size changes at the best bid and the
best ask as functions of the imbalance.]

Figure 7. Correlations and Standard Deviations of the Volumes
at the Best Bid and the Best Ask of General Electric at NASDAQ
and NYSE


Let the probability of price moving up be:

(3.5)    $u(x, y) := \mathbb{P}(\tau^a < \tau^b \mid Q^b(0) = x,\ Q^a(0) = y).$
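
To make the definition of the probability of the price moving up concrete, one can estimate it by simulating the two correlated queues until one of them is depleted. The Python sketch below uses a plain Euler discretization; the function name `simulate_prob_up`, the step size, the path count and the tie-breaking rule (if both queues cross zero in the same step, the more negative one is treated as depleted first) are all assumptions of this illustration, and the discretization introduces a bias that shrinks with the step size.

```python
import math
import random


def simulate_prob_up(x, y, rho, sigma=1.0, dt=0.01, n_paths=2000,
                     max_steps=100_000, seed=0):
    """Monte Carlo estimate of P(ask queue is depleted before bid queue).

    x, y: initial best bid and best ask queue lengths (both positive).
    rho: correlation between the two driving Brownian motions.
    Paths that do not hit zero within max_steps are discarded.
    """
    if x <= 0 or y <= 0:
        raise ValueError("initial queue lengths must be positive")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    step = sigma * math.sqrt(dt)
    mix = math.sqrt(1.0 - rho * rho)
    ups = 0
    decided = 0
    for _ in range(n_paths):
        q_bid, q_ask = x, y
        for _ in range(max_steps):
            g1 = rng.gauss(0.0, 1.0)
            g2 = rng.gauss(0.0, 1.0)
            q_bid += step * g1
            q_ask += step * (rho * g1 + mix * g2)  # correlation rho with bid
            if q_ask <= 0.0 or q_bid <= 0.0:
                decided += 1
                if q_ask <= 0.0 and (q_bid > 0.0 or q_ask < q_bid):
                    ups += 1  # ask depleted first: price moves up
                break
    if decided == 0:
        raise RuntimeError("no path was absorbed; increase max_steps")
    return ups / decided
```

With equal starting queues the estimate should be close to one half by symmetry, which gives a quick sanity check of the simulation.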

[Figure 8: panels for General Motors at NASDAQ and at NYSE, showing the correlation
and the standard deviations of the size changes at the best bid and the best ask;
horizontal axis: Imbalance.]

Figure 8. Correlations and Standard Deviations of the Volumes
at the Best Bid and the Best Ask of General Motors at NASDAQ
and NYSE

[Figure 9: panels for JPMorgan Chase & Co. at NASDAQ and at NYSE, showing the
correlation and the standard deviations of the size changes at the best bid and the
best ask; horizontal axis: Imbalance.]

Figure 9. Correlations and Standard Deviations of the Volumes
at the Best Bid and the Best Ask of JP Morgan & Chase at NASDAQ
and NYSE


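It is known that, see e.g. Avellaneda et al. [2]:

(3.6)    $u(x, y) = \dfrac{1}{2}\left(1 - \dfrac{\arctan\left(\sqrt{\frac{1+\rho}{1-\rho}}\,\frac{y-x}{y+x}\right)}{\arctan\left(\sqrt{\frac{1+\rho}{1-\rho}}\right)}\right).$
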
[Figure 10: two panels, Bank of America at NASDAQ and Bank of America at NYSE;
vertical axis: Probability of Price Up.]

Figure 10. Empirical Probability (Dotted Lines) and Model Prediction
(Solid Lines) of Bank of America

[Figure 11: two panels, General Electric at NASDAQ and General Electric at NYSE;
vertical axis: Probability of Price Up.]

Figure 11. Empirical Probability (Dotted Lines) and Model Prediction
(Solid Lines) of General Electric

[Figure 12: two panels, General Motors at NASDAQ and General Motors at NYSE;
vertical axis: Probability of Price Up.]

Figure 12. Empirical Probability (Dotted Lines) and Model Prediction
(Solid Lines) of General Motors

[Figure 13: two panels, JPMorgan Chase & Co. at NASDAQ and JPMorgan Chase & Co. at
NYSE.]

Figure 13. Empirical Probability (Dotted Lines) and Model Prediction
(Solid Lines) of JP Morgan Chase & Co.


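When there is no correlation, i.e., $\rho = 0$:

(3.7)    $u(x, y) = \dfrac{2}{\pi} \arctan\left(\dfrac{x}{y}\right).$

When the correlation is perfectly negative, i.e., $\rho = -1$:

(3.8)    $u(x, y) = \dfrac{x}{x + y}.$

From (3.6), we can see that the probability of price moving up can be written as a
function depending only on the imbalance:

(3.9)    $P_{\mathrm{up}}(z) = \dfrac{1}{2}\left(1 - \dfrac{\arctan\left(\sqrt{\frac{1+\rho}{1-\rho}}\,(1 - 2z)\right)}{\arctan\left(\sqrt{\frac{1+\rho}{1-\rho}}\right)}\right),$

where $z = \frac{x}{x+y}$. Moreover, $P_{\mathrm{up}}(z)$ is monotonically increasing in the imbalance $z$.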
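
As a numerical check of the closed-form expressions above, the following Python sketch evaluates the arctangent formula for the probability of an upward move and its two special cases. The function names `prob_up` and `prob_up_from_imbalance` are hypothetical, and the value at correlation exactly -1 is taken from the limiting linear formula x/(x+y), since the general expression is a 0/0 ratio there.

```python
import math


def prob_up(x, y, rho):
    """Probability that the ask queue is depleted before the bid queue.

    x, y: initial best bid and best ask queue lengths (both positive).
    rho: correlation in [-1, 1); rho == -1 uses the limiting formula.
    """
    if x <= 0 or y <= 0:
        raise ValueError("queue lengths must be positive")
    if not -1.0 <= rho < 1.0:
        raise ValueError("rho must lie in [-1, 1)")
    if rho == -1.0:
        return x / (x + y)  # perfectly negative correlation: linear in imbalance
    c = math.sqrt((1.0 + rho) / (1.0 - rho))
    return 0.5 * (1.0 - math.atan(c * (y - x) / (y + x)) / math.atan(c))


def prob_up_from_imbalance(z, rho):
    """Same probability written as a function of the imbalance z in (0, 1)."""
    return prob_up(z, 1.0 - z, rho)
```

Evaluating `prob_up_from_imbalance` on a grid of imbalance values for several correlations shows how the curve bends away from the straight line as the correlation moves away from -1.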

Remark 1. More generally, we can assume that the diffusion processes are correlated
Brownian motions with constant drifts:

(3.10)    $dQ^b(t) = \mu^b\, dt + \sigma^b\, dW^b(t), \quad Q^b(0) = x,$

(3.11)    $dQ^a(t) = \mu^a\, dt + \sigma^a\, dW^a(t), \quad Q^a(0) = y.$


Based on the results in Iyengar E] and Metzler mi, we have 


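(3.12)    $u(x, y) = \displaystyle\int_0^\infty \int_0^\infty e^{\gamma^a (r\cos\alpha - z^a) + \gamma^b (r\sin\alpha - z^b) - \frac{(\gamma^a)^2 + (\gamma^b)^2}{2} t}\, g(t, r)\, dr\, dt,$

where

(3.13)    $\begin{bmatrix} \gamma^a \\ \gamma^b \end{bmatrix} = \begin{bmatrix} \sigma^a\sqrt{1-\rho^2} & \sigma^a\rho \\ 0 & \sigma^b \end{bmatrix}^{-1} \begin{bmatrix} \mu^a \\ \mu^b \end{bmatrix}, \quad \begin{bmatrix} z^a \\ z^b \end{bmatrix} = \begin{bmatrix} \sigma^a\sqrt{1-\rho^2} & \sigma^a\rho \\ 0 & \sigma^b \end{bmatrix}^{-1} \begin{bmatrix} y \\ x \end{bmatrix},$

and $g(t, r) = \dfrac{\pi}{\alpha^2 t r}\, e^{-(r^2 + r_0^2)/(2t)} \sum_{n=1}^{\infty} n \sin\big(n\pi(\alpha - \theta_0)/\alpha\big)\, I_{n\pi/\alpha}(r r_0/t)$, where $I_\nu$ is
the modified Bessel function of the first kind and

$\alpha := \begin{cases} \pi + \arctan\big(-\sqrt{1-\rho^2}/\rho\big), & \rho > 0, \\ \pi/2, & \rho = 0, \\ \arctan\big(-\sqrt{1-\rho^2}/\rho\big), & \rho < 0, \end{cases}$

$r_0 := \sqrt{(x/\sigma^b)^2 + (y/\sigma^a)^2 - 2\rho\,(x/\sigma^b)(y/\sigma^a)} \Big/ \sqrt{1-\rho^2},$

$\theta_0 := \begin{cases} \pi + \arctan\left(\dfrac{(x/\sigma^b)\sqrt{1-\rho^2}}{y/\sigma^a - \rho x/\sigma^b}\right), & y/\sigma^a < \rho x/\sigma^b, \\ \pi/2, & y/\sigma^a = \rho x/\sigma^b, \\ \arctan\left(\dfrac{(x/\sigma^b)\sqrt{1-\rho^2}}{y/\sigma^a - \rho x/\sigma^b}\right), & y/\sigma^a > \rho x/\sigma^b. \end{cases}$

In particular, when $\mu^b = \mu^a = 0$,

(3.14)    $u(x, y) = \dfrac{\theta_0}{\alpha}.$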

In Avellaneda et al. [2], the authors fitted the empirical probability of the
mid-price moving up by the correlated Brownian motion model when the correlation is
$\rho = -1$, that is, (3.8). From Figures 10 to 13 and Tables 4 and 5, the empirical
probability of the mid-price moving up is indeed linearly dependent on the imbalance.
However, as we have already seen in Figures 6 to 9, the correlation is negative, but
far away from -1, and it also depends on the level of the imbalance. Therefore, a
perfectly negatively correlated Brownian motions model might not fit both the
empirical probability and the empirical correlation. We will propose a non-parametric
diffusion model that can fit the empirical correlation, empirical volatilities, and
empirical probability of price movement simultaneously.

The correlated Brownian motion model is simple yet still captures the phenomenon
that the price movement is mainly driven by the imbalance at the best bid and
ask level. We are interested in investigating further the relation between the
dynamics of the volumes at the best bid and ask level and the imbalance. The
assumption in the model (3.6) that the correlation and volatility of the volumes at
the best bid and ask levels are constant might be oversimplified and not consistent
with the real data. Indeed, our empirical studies suggest that the
correlation of the movements of the volumes at the best bid and ask level depends
non-trivially on the imbalance. Two universal shapes for the correlation as
a function of the imbalance are the U-shaped curve and the W-shaped curve. For a
U-shaped correlation function of the imbalance, the correlation is negative; it
is close to zero when the imbalance is close to 0 or 1, and it is the most negative
when the imbalance is close to 1/2. Similarly, we also observe W-shaped correlation
curves. The correlations are consistently negative though far away from $-1$. We
also observe that the volatilities of the volumes at the best bid and ask levels
depend non-trivially on the imbalance. The volatility is in general large when
the imbalance is small or large, and small when the imbalance is
moderate. The difference here is that instead of a symmetric U-shaped or W-shaped
curve, we often get two skewed U-shaped curves, depending on whether we
consider the best bid or the best ask. Therefore, our goal is to improve the model
(3.6) to allow the correlation and volatilities to be non-constant and depend on the
level of the imbalance.
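
The dependence on the imbalance described above can be explored by bucketing observations by imbalance level and computing sample statistics per bucket. The following is only a minimal sketch under stated assumptions: the function name `bucket_stats`, the input layout `(z, dx, dy)` and the use of ten equal-width buckets are hypothetical choices for illustration, not the estimation procedure used in the paper.

```python
import math
from collections import defaultdict

def bucket_stats(samples, n_buckets=10):
    """Per-bucket sample volatility and correlation of queue changes.

    samples: iterable of (z, dx, dy), where z in [0, 1] is the imbalance and
    dx, dy are changes of the best bid and best ask queue sizes (assumed layout).
    Returns {bucket_index: (vol_bid, vol_ask, corr)}.
    """
    groups = defaultdict(list)
    for z, dx, dy in samples:
        k = min(int(z * n_buckets), n_buckets - 1)  # z = 1 falls in the last bucket
        groups[k].append((dx, dy))
    stats = {}
    for k, pairs in sorted(groups.items()):
        n = len(pairs)
        if n < 2:
            continue  # not enough data for a sample variance
        mx = sum(p[0] for p in pairs) / n
        my = sum(p[1] for p in pairs) / n
        vxx = sum((p[0] - mx) ** 2 for p in pairs) / (n - 1)
        vyy = sum((p[1] - my) ** 2 for p in pairs) / (n - 1)
        vxy = sum((p[0] - mx) * (p[1] - my) for p in pairs) / (n - 1)
        corr = vxy / math.sqrt(vxx * vyy) if vxx > 0 and vyy > 0 else float("nan")
        stats[k] = (math.sqrt(vxx), math.sqrt(vyy), corr)
    return stats
```

Plotting the third entry of each tuple against the bucket index would show whether the correlation looks U-shaped or W-shaped for a given data set.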

In a very loose analogy, in the literature of the pricing of derivative securities, 
it is well known that the stock price has the so-called leverage effect, that is, the 
volatility of a stock tends to increase when the stock price drops, which is one of 
the key reasons that people have used the CEV models and other local volatility 
models as an alternative to the Black-Scholes model in which the volatility is always 
constant. 

We are interested in building a model for the dynamics of the level-1 limit order
books that can capture the empirical evidence that we observed from the data.

Let us build a discrete model and find its diffusion approximation. Let $X(t)$, $Y(t)$
be the queue lengths at the best bid and the best ask at time $t$ and
$Z_t = \frac{X(t)}{X(t)+Y(t)}$ be the imbalance. Let us assume that


• The limit orders that arrive at the best bid form a simple point process $N^1(t)$
with intensity $\lambda^1(Z_{t-})$ at time $t$;

• The market orders or cancellations that arrive at the best bid form a simple
point process $N^2(t)$ with intensity $\lambda^2(Z_{t-})$ at time $t$;

• The limit orders that arrive at the best ask form a simple point process $N^3(t)$
with intensity $\lambda^3(Z_{t-})$ at time $t$;

• The market orders or cancellations that arrive at the best ask form a simple
point process $N^4(t)$ with intensity $\lambda^4(Z_{t-})$ at time $t$;

• There are simultaneous cancellations at the best ask and limit orders at
the best bid that form a simple point process $N^5(t)$ with intensity $\lambda^5(Z_{t-})$ at
time $t$;

• There are simultaneous cancellations at the best bid and limit orders at
the best ask that form a simple point process $N^6(t)$ with intensity $\lambda^6(Z_{t-})$ at
time $t$.

The last two assumptions above are made due to the observation that the empirical
correlation between the best bid and ask queues is always negative. Note that $\lambda^1$
is the arrival rate for the idiosyncratic limit orders at the best bid, so the total
arrival rate for the limit orders at the best bid is $\lambda^1 + \lambda^5$. Similarly, the total arrival
rate for the limit orders at the best ask is $\lambda^3 + \lambda^6$. For $1 \le j \le 6$, we assume that
$\lambda^j(z) = \lambda^j\left(\frac{x}{x+y}\right) : \mathbb{R} \to \mathbb{R}_+$ are continuous and bounded (there is a singularity when
$x + y = 0$ and we assume analytic continuation of $\lambda^j$ at the singularity). Finally,
for simplicity, we assume that every order has unit size 1. Note that all the
following arguments work if we assume constant order sizes for different types of
orders. Therefore, the dynamics at the best bid and ask are given by:

\[
\begin{aligned}
dX(t) &= dN^1(t) - dN^2(t) + dN^5(t) - dN^6(t), \\
dY(t) &= dN^3(t) - dN^4(t) + dN^6(t) - dN^5(t).
\end{aligned}
\tag{3.15}
\]


Since empirically we do not observe strong evidence for the drift effect, we assume
the driftless condition:

\[ \lambda^1(z) - \lambda^2(z) = \lambda^4(z) - \lambda^3(z) = \lambda^6(z) - \lambda^5(z), \tag{3.16} \]

so that $X(t)$ and $Y(t)$ are driftless, in the sense that

\[
\begin{aligned}
dX(t) &= dM^1(t) - dM^2(t) + dM^5(t) - dM^6(t), \\
dY(t) &= dM^3(t) - dM^4(t) + dM^6(t) - dM^5(t),
\end{aligned}
\]

where for any $1 \le j \le 6$, $M^j(t) := N^j(t) - \int_0^t \lambda^j(Z_{s-})\,ds$ is a martingale. For
high frequency trading, the number of orders is large and the trading frequency
is high, so we can rescale time and space to get a diffusion approximation to the
discrete model. Let us define the rescaled processes, for $1 \le j \le 6$,

\[ X_n(t) := \frac{1}{\sqrt{n}} X(nt), \qquad Y_n(t) := \frac{1}{\sqrt{n}} Y(nt), \qquad M_n^j(t) := \frac{1}{\sqrt{n}} M^j(nt). \tag{3.17} \]
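
The discrete dynamics (3.15) can be simulated directly. The sketch below, under stated assumptions, estimates the probability that the best ask queue empties before the best bid queue by Monte Carlo. The function name `simulate_up_probability` and the callable `rates` (returning the six intensities at a given imbalance) are hypothetical names introduced here for illustration. Only the embedded jump chain is simulated, since event times do not affect which queue empties first.

```python
import random

def simulate_up_probability(x0, y0, rates, n_paths=2000, seed=0):
    """Monte Carlo estimate of P(best ask queue hits zero before best bid queue).

    rates(z) must return the six intensities (lam1, ..., lam6) at imbalance z;
    at least one of them has to be positive.
    """
    rng = random.Random(seed)
    # Effect of each event type on (X, Y), in the order N^1, ..., N^6 of (3.15).
    jumps = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
    ups = 0
    for _ in range(n_paths):
        x, y = x0, y0
        while x > 0 and y > 0:
            lam = rates(x / (x + y))  # intensities use the pre-jump imbalance
            dx, dy = rng.choices(jumps, weights=lam)[0]
            x, y = x + dx, y + dy
        ups += (y == 0)
    return ups / n_paths
```

For instance, with only the two simultaneous event types active and equal rates, `simulate_up_probability(3, 7, lambda z: (0, 0, 0, 0, 1, 1))` should be close to 3/10, the gambler's-ruin value $x/(x+y)$.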


The discrete model (3.15) describes the dynamics of the best bid and ask queues
at the micro level, but may not be easy to work with when we are interested in
computing the probability of mid-price movement. So, next, let us find a diffusion
approximation to the discrete model (3.15).
Let us assume that

\[
\begin{aligned}
dQ^b(t) &= \sigma^b(Z(t))\,dW^b(t), \qquad Q^b(0) = x > 0, \\
dQ^a(t) &= \sigma^a(Z(t))\,dW^a(t), \qquad Q^a(0) = y > 0, \\
Z(t) &= \frac{Q^b(t)}{Q^b(t) + Q^a(t)},
\end{aligned}
\tag{3.18}
\]

where $W^b(t)$ and $W^a(t)$ are two standard Brownian motions with correlation $\rho(Z(t))$
at time $t$. We assume that $(x,y) \mapsto \sigma^b\left(\frac{x}{x+y}\right)$, $(x,y) \mapsto \sigma^a\left(\frac{x}{x+y}\right)$ are bounded and
continuous from $\mathbb{R}^2$ to $\mathbb{R}_+$, and $(x,y) \mapsto \rho\left(\frac{x}{x+y}\right)$ is bounded and continuous$^1$ from
$\mathbb{R}^2$ to $[-1,1]$, so that there exists a unique solution to (3.18), see e.g. [16]. If in
addition, we assume that $\sigma^b$, $\sigma^a$, $\rho$ are Lipschitz, then the solution is guaranteed to
be strong, see e.g. [El- 

Note that the discrete process $(X_n(t), Y_n(t))$ and $(Q^b(t), Q^a(t))$ should both live
in the first quadrant. But to avoid well-definedness issues after the process hits the
boundary of the first quadrant, we make the processes well-defined on $\mathbb{R}^2$. Since
our goal is to compute the probability of mid-price movement, which is about the
first hitting time of the boundary of the first quadrant, the extension from the first
quadrant to $\mathbb{R}^2$ will not alter the results, and it is just for the sake of convenience.


The discrete model (3.15) can be approximated by the diffusion model (3.18) as
follows.

Theorem 2. Given that $(X(t), Y(t))$ is the discrete model of the best bid and ask
queues in (3.15), assume that for $1 \le j \le 6$, $\lambda^j(z) : \mathbb{R} \to \mathbb{R}^+$ are continuous
and bounded functions and the driftless condition (3.16) holds. Also assume that
$(X_n(0), Y_n(0)) = (x, y) \in \mathbb{R}^+ \times \mathbb{R}^+$. Then the rescaled process $(X_n(t), Y_n(t))$ in
(3.17) converges weakly in $D[0,T]$ as $n \to \infty$ to $(Q^b(t), Q^a(t))$ in (3.18), where
$D[0,T]$ is the space of càdlàg processes equipped with the Skorokhod topology. In addition,
the diffusion and correlation coefficients are explicit functions of the intensities
$\lambda^j(z)$:

\[
\begin{aligned}
\sigma^b(z) &= \left[\lambda^1(z) + \lambda^2(z) + \lambda^5(z) + \lambda^6(z)\right]^{1/2}, \\
\sigma^a(z) &= \left[\lambda^3(z) + \lambda^4(z) + \lambda^5(z) + \lambda^6(z)\right]^{1/2}, \\
\rho(z) &= -\frac{\lambda^5(z) + \lambda^6(z)}{\sigma^b(z)\sigma^a(z)}.
\end{aligned}
\]


$^1$ Note that $\sigma^b$, $\sigma^a$ and $\rho$ can be singular, and are defined as the analytic continuation at the
singular points $\pm\infty$.
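
The mapping from intensities to diffusion coefficients in Theorem 2 is simple enough to write down as code. The sketch below evaluates it at a single imbalance level and also checks the driftless condition (3.16). The function names `diffusion_coefficients` and `is_driftless` are hypothetical, introduced only for illustration.

```python
import math

def diffusion_coefficients(lam):
    """Return (sigma_b, sigma_a, rho) from lam = (lam1, ..., lam6) at a fixed z."""
    l1, l2, l3, l4, l5, l6 = lam
    sigma_b = math.sqrt(l1 + l2 + l5 + l6)  # every event touching the bid queue
    sigma_a = math.sqrt(l3 + l4 + l5 + l6)  # every event touching the ask queue
    rho = -(l5 + l6) / (sigma_b * sigma_a)  # only simultaneous events correlate
    return sigma_b, sigma_a, rho

def is_driftless(lam, tol=1e-12):
    """Check lam1 - lam2 = lam4 - lam3 = lam6 - lam5, as in (3.16)."""
    l1, l2, l3, l4, l5, l6 = lam
    d = l1 - l2
    return abs(d - (l4 - l3)) < tol and abs(d - (l6 - l5)) < tol
```

With `lam = (0, 0, 0, 0, 1, 1)` the correlation is $-1$ up to rounding, and with `lam = (1, 1, 1, 1, 0, 0)` it is zero, matching the role of the simultaneous events $N^5$, $N^6$ as the only source of (negative) correlation.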
The probability of mid-price movement for the diffusion model (3.18) can be
computed in closed form as follows.


Theorem 3. Given the model (3.18), $P_{up}(z)$, the probability of the price moving
up, defined in (3.3) and (3.4), is explicitly given by

\[ P_{up}(z) = \frac{\int_0^z e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}{\int_0^1 e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}, \tag{3.19} \]

where $z$ is the imbalance and

\[
\begin{aligned}
\mu(z) &= -2(1-z)\sigma^b(z)^2 + 2(2z-1)\rho(z)\sigma^b(z)\sigma^a(z) + 2z\sigma^a(z)^2, \\
\nu(z) &= (1-z)^2\sigma^b(z)^2 - 2z(1-z)\rho(z)\sigma^b(z)\sigma^a(z) + z^2\sigma^a(z)^2.
\end{aligned}
\tag{3.20}
\]
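
Formula (3.19) can be evaluated numerically once $\sigma^b$, $\sigma^a$ and $\rho$ are available as functions of the imbalance. The following is a minimal sketch using the trapezoidal rule on a uniform grid. The function name `p_up`, the grid size and the linear interpolation at $z$ are choices made here for illustration, and the inputs are assumed to keep $\nu(z) > 0$ on $[0,1]$ (for example positive volatilities and $\rho < 1$).

```python
import math

def p_up(z, sigma_b, sigma_a, rho, n=2000):
    """Evaluate (3.19) with the trapezoidal rule; sigma_b, sigma_a, rho are callables."""
    h = 1.0 / n
    grid = [i * h for i in range(n + 1)]

    def mu(s):
        b, a, r = sigma_b(s), sigma_a(s), rho(s)
        return -2 * (1 - s) * b * b + 2 * (2 * s - 1) * r * b * a + 2 * s * a * a

    def nu(s):
        b, a, r = sigma_b(s), sigma_a(s), rho(s)
        return (1 - s) ** 2 * b * b - 2 * s * (1 - s) * r * b * a + s * s * a * a

    ratio = [mu(s) / nu(s) for s in grid]
    # inner[i] approximates the integral of mu/nu over [0, grid[i]]
    inner = [0.0]
    for i in range(1, n + 1):
        inner.append(inner[-1] + 0.5 * h * (ratio[i - 1] + ratio[i]))
    weight = [math.exp(-v) for v in inner]
    # outer[i] approximates the integral of exp(-inner) over [0, grid[i]]
    outer = [0.0]
    for i in range(1, n + 1):
        outer.append(outer[-1] + 0.5 * h * (weight[i - 1] + weight[i]))
    # linear interpolation of the numerator at z, then normalisation
    pos = min(max(z, 0.0), 1.0) * n
    i = min(int(pos), n - 1)
    frac = pos - i
    return (outer[i] + frac * (outer[i + 1] - outer[i])) / outer[n]
```

With constant unit volatilities, `rho = -1` gives back $P_{up}(z) = z$ and `rho = 0` gives $\frac{2}{\pi}\arctan\frac{z}{1-z}$, which is a convenient way to check an implementation against the two classical cases.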

Remark 4. What we are really interested in computing is the probability of mid-price
movement for the discrete model, and this can be approximated by the probability
of mid-price movement for the diffusion model, which has the closed-form formula
given in Theorem 3. For any $n > 0$, we have

\[
\begin{aligned}
P(X(t) \text{ hits zero before } Y(t) \text{ does})
&= P\left(\tfrac{1}{\sqrt{n}}X(nt) \text{ hits zero before } \tfrac{1}{\sqrt{n}}Y(nt) \text{ does}\right) \\
&\approx P(Q^b(t) \text{ hits zero before } Q^a(t) \text{ does}),
\end{aligned}
\]

as $n \to \infty$. Note that the approximation requires that $Q^b(0) = \frac{1}{\sqrt{n}}X(0)$ and $Q^a(0) =
\frac{1}{\sqrt{n}}Y(0)$, and this is still reasonable since the formula in Theorem 3 only depends
on the ratio of $Q^b(0)$ and $Q^a(0)$, so we can rescale the initial condition.

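Remark 5. We can recover the results in [2] from Theorem 3.

(1) When $\sigma^b = \sigma^a = \sigma$ and $\rho = -1$, we have $\mu(z) = 0$ and thus $P_{up}(z) = z =
\frac{x}{x+y}$, which recovers (3.8).

(2) When $\sigma^b = \sigma^a = \sigma$ and $\rho = 0$, we have $\mu(z) = -2(1-2z)\sigma^2$ and $\nu(z) =
[(1-z)^2 + z^2]\sigma^2$. Therefore,

\[
P_{up}(z)
= \frac{\int_0^z e^{\int_0^y \frac{2-4x}{1-2x+2x^2}\,dx}\,dy}{\int_0^1 e^{\int_0^y \frac{2-4x}{1-2x+2x^2}\,dx}\,dy}
= \frac{\int_0^z \frac{dy}{1-2y+2y^2}}{\int_0^1 \frac{dy}{1-2y+2y^2}}
= \frac{\frac{\pi}{4} - \arctan(1-2z)}{\frac{\pi}{2}}
= \frac{2}{\pi}\arctan\left(\frac{z}{1-z}\right),
\]

which recovers (3.7).


(3) In the special case that $\sigma^b(\cdot) = \sigma^a(\cdot)$ and $\rho(\cdot) \equiv \rho$, by (3.14), we have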


u{x,y) = 9o/a 

y < px 

H 

II 

y> px 

p > 0 

TT— arctanf A ) 

V px — y ) 

7r/2 

arctan(A ) 

V px — y ) 

TT— arctan A 

TT— arctan A 

TT— arctan A 

O 

II 

N/A 

1 

- arctan ( A ) 

TT \ px-y ) 

p < 0 

N/A 

7t/2 

— arctan A 

arctanf A ) 

V p^-y) 

— arctan A 


where A = — p^ jP- 
Remark 6. When $\rho(\cdot) = -1$, we have $\lambda^1 = \lambda^2 = \lambda^3 = \lambda^4 = 0$ and $\lambda^5(\cdot) = \lambda^6(\cdot)$,
and thus $\sigma^b(\cdot) = \sigma^a(\cdot)$. By using the assumption (3.16) we can check that the
probability of mid-price moving up $u(x,y) = \frac{x}{x+y}$ satisfies (A.4), and thus this is
the probability of mid-price movement for the reduced-form model. Indeed, for the
original unscaled discrete model (3.15), $u(x,y)$ satisfies the equation

\[ \lambda^5\left(\frac{x}{x+y}\right)\left[u(x+1,y-1) - u(x,y)\right] + \lambda^6\left(\frac{x}{x+y}\right)\left[u(x-1,y+1) - u(x,y)\right] = 0, \tag{3.21} \]

for $(x,y) \in \mathbb{Z}_{>0} \times \mathbb{Z}_{>0}$ with boundary condition $u(0,y) = 0$ and $u(x,0) = 1$. It is
easy to see that $u(x,y) = \frac{x}{x+y}$. Indeed, this result holds for perfectly negatively
correlated queues in a model-free way. To see this, notice that $X(t)$ and $Y(t)$ are perfectly
negatively correlated martingales and we can write $X(t) = x + M(t)$ and $Y(t) =
y - M(t)$, where $M(t)$ is a martingale. Therefore, the probability that $X(t)$ hits
0 before $Y(t)$ does is the same as the probability that $M(t)$ hits $-x$ before $y$. By
the optional stopping theorem for martingales, this probability can be easily computed
as $\frac{y}{x+y}$, so that the probability that $Y(t)$ hits 0 first is $\frac{x}{x+y}$.
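
The claim that $u(x,y) = \frac{x}{x+y}$ solves (3.21) when $\lambda^5 = \lambda^6$ can be checked with exact rational arithmetic. The sketch below does this on a small lattice. The names `u` and `residual` are hypothetical, and the intensities are taken as constants purely for the purpose of the check.

```python
from fractions import Fraction

def u(x, y):
    """Candidate solution u(x, y) = x / (x + y), with u(0, y) = 0 and u(x, 0) = 1."""
    return Fraction(x, x + y)

def residual(x, y, lam5, lam6):
    """Left-hand side of (3.21) at the lattice point (x, y)."""
    return (lam5 * (u(x + 1, y - 1) - u(x, y))
            + lam6 * (u(x - 1, y + 1) - u(x, y)))
```

Each bracket in (3.21) equals $\pm\frac{1}{x+y}$, so the residual is $(\lambda^5 - \lambda^6)/(x+y)$: it vanishes exactly when the two intensities agree, which is what the driftless condition enforces in this case.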

As we can see from the figures, the empirical probability of mid-price
moving up (dotted lines) is linearly dependent on the imbalance. Except for BAC
at NASDAQ and GE at NASDAQ, there is also strong numerical evidence
of hidden liquidity, that is, the probability of moving up is bigger than zero
when the imbalance is near zero and less than one when the imbalance is near one.
To better fit the data, we introduce the hidden liquidity $H \in (0,1)$, so that the
theoretical probability of mid-price moving up is $H$ for imbalance at zero and $1 - H$
for imbalance at one$^2$. That is, $P_{up}(z)$ satisfies the boundary conditions $P_{up}(0) = H$
and $P_{up}(1) = 1 - H$. Following the proof of Theorem 3, we get the following result
(the proof will be given in Appendix A).

Theorem 7. Given the model (3.18), $P_{up}(z)$, the probability of the price moving up,
defined in (3.3) and (3.4) with the boundary conditions $P_{up}(0) = H$, $P_{up}(1) = 1 - H$,
is explicitly given by

\[ P_{up}(z) = H + (1 - 2H)\,\frac{\int_0^z e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}{\int_0^1 e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}, \tag{3.22} \]

where $z = \frac{x}{x+y}$ is the imbalance and $\mu$, $\nu$ are the functions in (3.20).

To fit the data, we plug the empirical data for $\sigma^b(\cdot)$, $\sigma^a(\cdot)$ and $\rho(\cdot)$ into the
formula (3.22) and obtain $P_{theoretical}(z, H)$, the theoretical probability of mid-price
moving up at imbalance level $z$, and then use the least squares method

\[ \min_H \sum_z \left(P_{empirical}(z) - P_{theoretical}(z, H)\right)^2, \tag{3.23} \]

to find the best fitting hidden liquidity H. The solid lines in Figures [TOl [TTj [T^ [I^ 
are the model predictions. Remarkably, the solid lines are almost linear in imbalance 


$^2$ This is slightly different from the definition of hidden liquidity in Avellaneda et al. [2]. Our
definition of $H$ can be interpreted as a probability, a free parameter that helps get $P_{up}$ closer to
the empirical data in the least-squares sense, rather than a real liquidity as in Avellaneda et al. [2].
We choose this definition to keep the analytical tractability of the model. Our definition
is also simpler in the sense that we only need to bucket the data and discretize our analytical formula
according to the imbalance level rather than the best bid and ask queue sizes as in [2].
even though the correlation is a complicated function of imbalance and far away
from being $-1$. Hence, we have built up a model with the empirical correlation
and volatilities as the input that can produce an analytical formula for the price
movement probability that can be used to fit the empirical probability data.
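
Because (3.22) is affine in $H$, the least squares problem (3.23) has a closed-form minimizer. Writing $R(z)$ for the ratio of integrals in (3.22), the residual at level $z$ is $(P_{empirical}(z) - R(z)) - H(1 - 2R(z))$, so setting the derivative in $H$ to zero gives $H = \sum_z (P_{empirical}(z) - R(z))(1 - 2R(z)) / \sum_z (1 - 2R(z))^2$. The sketch below implements this under stated assumptions: the function name `fit_hidden_liquidity` is hypothetical, the inputs are assumed to be aligned lists over imbalance buckets, and no constraint on the range of $H$ is enforced.

```python
def fit_hidden_liquidity(p_empirical, r_model):
    """Closed-form least-squares estimate of H in (3.23).

    p_empirical[i]: empirical probability of an up-move in bucket i.
    r_model[i]: ratio of integrals in (3.22) at the imbalance level of bucket i,
                i.e. the Theorem 3 prediction without hidden liquidity.
    """
    num = sum((p - r) * (1 - 2 * r) for p, r in zip(p_empirical, r_model))
    den = sum((1 - 2 * r) ** 2 for r in r_model)
    return num / den
```

If the empirical curve is exactly of the form $H + (1 - 2H)R(z)$, the function returns that $H$. On real data it returns the value that minimizes the sum of squared deviations across buckets.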


4. Conclusions 

We carried out numerical studies of the drift effect, correlation and volatility of the best
bid and ask queues, and of how they depend on the imbalance of the volumes at the
best bid and ask queues, using the level-1 limit order books data from the WRDS.
We discovered that there is little evidence for the drift except when the imbalance
is small or large. The correlation as a function of the imbalance exhibits universal
behaviors, being either a U-shaped or a W-shaped curve, and it is almost always
negative though far away from $-1$. The volatility is much noisier and in
general lacks a clear pattern, though it very often exhibits skewed U-shapes. All the
empirical results are highly stock and also exchange dependent, which suggests that
the dynamics of the limit order books are very sensitive to the particular stock
and exchange. Based on our empirical discoveries, we built up a discrete
model for the dynamics of the best bid and ask queues and showed that it can
be approximated by a reduced-form diffusion model with functional dependence of
the drift, correlation and volatility on the imbalance, which therefore generalizes
the correlated Brownian motion model that is commonly used in the limit order
books literature. Our reduced-form model still keeps analytical tractability, and
it is self-consistent when it is fit to the data of both the empirical probability of
mid-price movement and the empirical correlation/volatility.
Appendix A. Proofs


Proof of Theorem 2. Notice that $M^j(t)$ are martingales with predictable quadratic
variation $\int_0^t \lambda^j(Z_{s-})\,ds$, where $\lambda^j$ is bounded, i.e. $\|\lambda^j\|_\infty < \infty$. For any $t \in [0, T-\delta]$,
$\delta > 0$, we have

\[ E\left[\left\|(X_n(t+\delta), Y_n(t+\delta)) - (X_n(t), Y_n(t))\right\|^4\right] \le C \sum_{j=1}^{6} E\left[\left(M_n^j(t+\delta) - M_n^j(t)\right)^4\right], \tag{A.1} \]

for some constant $C > 0$. By the Burkholder-Davis-Gundy inequality, for any $1 \le j \le 6$,

\[ E\left[\left(M_n^j(t+\delta) - M_n^j(t)\right)^4\right] \le C\,E\left[\left(\frac{1}{n}\int_{nt}^{n(t+\delta)} \lambda^j(Z_s)\,ds\right)^2\right] \le C\,\|\lambda^j\|_\infty^2\,\delta^2, \tag{A.2} \]

for some constant $C > 0$. Therefore, by applying Kolmogorov's tightness criterion,
we can show that $(X_n(t), Y_n(t))$ is tight.
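The infinitesimal generator for the rescaled process $(X_n(t), Y_n(t))$ is given by

\[
\begin{aligned}
\mathcal{L}_n f(x,y) := {} & n\lambda^1(z)\left[f\left(x + \tfrac{1}{\sqrt{n}}, y\right) - f(x,y)\right]
 + n\lambda^2(z)\left[f\left(x - \tfrac{1}{\sqrt{n}}, y\right) - f(x,y)\right] \\
& + n\lambda^3(z)\left[f\left(x, y + \tfrac{1}{\sqrt{n}}\right) - f(x,y)\right]
 + n\lambda^4(z)\left[f\left(x, y - \tfrac{1}{\sqrt{n}}\right) - f(x,y)\right] \\
& + n\lambda^5(z)\left[f\left(x + \tfrac{1}{\sqrt{n}}, y - \tfrac{1}{\sqrt{n}}\right) - f(x,y)\right]
 + n\lambda^6(z)\left[f\left(x - \tfrac{1}{\sqrt{n}}, y + \tfrac{1}{\sqrt{n}}\right) - f(x,y)\right],
\end{aligned}
\]

where $z = \frac{x}{x+y}$ and $f$ is a twice continuously differentiable test function.
By using the driftless assumption (3.16), we get

\[
\begin{aligned}
\mathcal{L}_n f(x,y) = {} & n\lambda^1(z)\left[\frac{1}{\sqrt{n}}\frac{\partial f}{\partial x} + \frac{1}{2n}\frac{\partial^2 f}{\partial x^2} + O(n^{-3/2})\right]
 + n\lambda^2(z)\left[-\frac{1}{\sqrt{n}}\frac{\partial f}{\partial x} + \frac{1}{2n}\frac{\partial^2 f}{\partial x^2} + O(n^{-3/2})\right] \\
& + n\lambda^3(z)\left[\frac{1}{\sqrt{n}}\frac{\partial f}{\partial y} + \frac{1}{2n}\frac{\partial^2 f}{\partial y^2} + O(n^{-3/2})\right]
 + n\lambda^4(z)\left[-\frac{1}{\sqrt{n}}\frac{\partial f}{\partial y} + \frac{1}{2n}\frac{\partial^2 f}{\partial y^2} + O(n^{-3/2})\right] \\
& + n\lambda^5(z)\left[\frac{1}{\sqrt{n}}\frac{\partial f}{\partial x} - \frac{1}{\sqrt{n}}\frac{\partial f}{\partial y} + \frac{1}{2n}\frac{\partial^2 f}{\partial x^2} + \frac{1}{2n}\frac{\partial^2 f}{\partial y^2} - \frac{1}{n}\frac{\partial^2 f}{\partial x \partial y} + O(n^{-3/2})\right] \\
& + n\lambda^6(z)\left[-\frac{1}{\sqrt{n}}\frac{\partial f}{\partial x} + \frac{1}{\sqrt{n}}\frac{\partial f}{\partial y} + \frac{1}{2n}\frac{\partial^2 f}{\partial x^2} + \frac{1}{2n}\frac{\partial^2 f}{\partial y^2} - \frac{1}{n}\frac{\partial^2 f}{\partial x \partial y} + O(n^{-3/2})\right] \\
= {} & \frac{1}{2}\left[\lambda^1(z) + \lambda^2(z) + \lambda^5(z) + \lambda^6(z)\right]\frac{\partial^2 f}{\partial x^2}
 + \frac{1}{2}\left[\lambda^3(z) + \lambda^4(z) + \lambda^5(z) + \lambda^6(z)\right]\frac{\partial^2 f}{\partial y^2} \\
& - \left[\lambda^5(z) + \lambda^6(z)\right]\frac{\partial^2 f}{\partial x \partial y} + O(n^{-1/2}).
\end{aligned}
\]

As $n \to \infty$, $\mathcal{L}_n f(x,y) \to \mathcal{L} f(x,y)$, where

\[
\mathcal{L} f(x,y) = \frac{1}{2}\sigma^b(z)^2\frac{\partial^2 f}{\partial x^2} + \rho(z)\sigma^b(z)\sigma^a(z)\frac{\partial^2 f}{\partial x \partial y} + \frac{1}{2}\sigma^a(z)^2\frac{\partial^2 f}{\partial y^2},
\]

\[
\begin{aligned}
\sigma^b(z) &:= \left[\lambda^1(z) + \lambda^2(z) + \lambda^5(z) + \lambda^6(z)\right]^{1/2}, \\
\sigma^a(z) &:= \left[\lambda^3(z) + \lambda^4(z) + \lambda^5(z) + \lambda^6(z)\right]^{1/2}, \\
\rho(z) &:= -\frac{\lambda^5(z) + \lambda^6(z)}{\sigma^b(z)\sigma^a(z)}.
\end{aligned}
\]

Also notice that by our assumption, the initial condition satisfies $(X_n(0), Y_n(0)) =
(x, y) \in \mathbb{R}^+ \times \mathbb{R}^+$. The tightness gives the relative compactness of the sequence, and
the convergence of infinitesimal generators gives convergence in distribution
at any fixed time point, which guarantees the weak convergence on $D[0,T]$, see
e.g. Theorem 7.8(b) of Chapter 3 in Ethier and Kurtz. Hence $(X_n(t), Y_n(t)) \Rightarrow
(Q^b(t), Q^a(t))$ on $D[0,T]$. □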


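Proof of Theorem 3. Recall that the probability that the price moves up is:

\[ P_{up}(x,y) = u(x,y) = P\left(\tau^a < \tau^b \mid Q^b(0) = x,\ Q^a(0) = y\right). \tag{A.3} \]

Then, $u(x,y)$ satisfies the PDE:

\[ \frac{1}{2}\sigma^b(z)^2\frac{\partial^2 u}{\partial x^2} + \rho(z)\sigma^b(z)\sigma^a(z)\frac{\partial^2 u}{\partial x \partial y} + \frac{1}{2}\sigma^a(z)^2\frac{\partial^2 u}{\partial y^2} = 0, \tag{A.4} \]

with the boundary condition $u(0,y) = 0$ and $u(x,0) = 1$, where $z = \frac{x}{x+y}$ is the
imbalance.

Assuming that $u(x,y)$ is a function of $z$ so that $u(x,y) = u(z)$, by the chain rule,

\[ \frac{\partial u}{\partial x} = \frac{y}{(x+y)^2}u'(z), \qquad \frac{\partial u}{\partial y} = -\frac{x}{(x+y)^2}u'(z), \tag{A.5} \]

\[ \frac{\partial^2 u}{\partial x^2} = \frac{y^2}{(x+y)^4}u''(z) - \frac{2y}{(x+y)^3}u'(z), \tag{A.6} \]

\[ \frac{\partial^2 u}{\partial y^2} = \frac{x^2}{(x+y)^4}u''(z) + \frac{2x}{(x+y)^3}u'(z), \tag{A.7} \]

\[ \frac{\partial^2 u}{\partial x \partial y} = \frac{x-y}{(x+y)^3}u'(z) - \frac{xy}{(x+y)^4}u''(z). \tag{A.8} \]

Hence, the PDE reduces to:

\[
\begin{aligned}
& \sigma^b(z)^2\left[\frac{y^2}{(x+y)^4}u''(z) - \frac{2y}{(x+y)^3}u'(z)\right]
 + 2\rho(z)\sigma^b(z)\sigma^a(z)\left[\frac{x-y}{(x+y)^3}u'(z) - \frac{xy}{(x+y)^4}u''(z)\right] \\
& \quad + \sigma^a(z)^2\left[\frac{x^2}{(x+y)^4}u''(z) + \frac{2x}{(x+y)^3}u'(z)\right] = 0,
\end{aligned}
\]

which, after multiplying by $x^2$, can be further reduced to the ODE:

\[
\begin{aligned}
& \sigma^b(z)^2\left[-2z^2(1-z)u'(z) + (1-z)^2z^2u''(z)\right]
 + 2\rho(z)\sigma^b(z)\sigma^a(z)\left[z^2(2z-1)u'(z) - z^3(1-z)u''(z)\right] \\
& \quad + \sigma^a(z)^2\left[2z^3u'(z) + z^4u''(z)\right] = 0,
\end{aligned}
\]

with the boundary condition $u(0) = 0$ and $u(1) = 1$, which can be rewritten as

\[ \mu(z)f(z) + \nu(z)f'(z) = 0, \tag{A.9} \]

where

\[
\begin{aligned}
\mu(z) &= -2(1-z)\sigma^b(z)^2 + 2(2z-1)\rho(z)\sigma^b(z)\sigma^a(z) + 2z\sigma^a(z)^2, \\
\nu(z) &= (1-z)^2\sigma^b(z)^2 - 2z(1-z)\rho(z)\sigma^b(z)\sigma^a(z) + z^2\sigma^a(z)^2,
\end{aligned}
\]

and $f(z) = u'(z)$. This is a first-order linear equation with solution

\[ f(z) = A\,e^{-\int_0^z \frac{\mu(y)}{\nu(y)}\,dy}, \tag{A.10} \]

and hence

\[ u(z) = C_1\int_0^z e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy + C_2, \tag{A.11} \]

where $C_1$, $C_2$ are two constants to be determined. By using the boundary conditions
$u(0) = 0$ and $u(1) = 1$, we conclude that

\[ P_{up}(x,y) = P_{up}(z) = u(z) = \frac{\int_0^z e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}{\int_0^1 e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}. \tag{A.12} \]

□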
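
As a sanity check on the reduction from the PDE (A.4) to the ODE (A.9), one can take constant, equal volatilities and a constant correlation $\rho \in (-1,1)$. In that case $\mu = \nu'$, and integrating (A.12) directly gives $u(z) = \frac{1}{2} + \frac{\arctan(k(2z-1))}{2\arctan k}$ with $k = \sqrt{(1+\rho)/(1-\rho)}$. This closed form is worked out here for illustration rather than quoted from the text. The sketch below evaluates the left-hand side of (A.4) for this $u$ by central finite differences. The names `u_const` and `pde_residual` and the step size are hypothetical choices.

```python
import math

def u_const(x, y, rho):
    """Closed form of (A.12) for sigma_b = sigma_a = const and constant rho in (-1, 1)."""
    k = math.sqrt((1 + rho) / (1 - rho))
    z = x / (x + y)
    return 0.5 + math.atan(k * (2 * z - 1)) / (2 * math.atan(k))

def pde_residual(x, y, rho, h=1e-3):
    """Central-difference value of (1/2) u_xx + rho u_xy + (1/2) u_yy with sigma = 1."""
    f = lambda a, b: u_const(a, b, rho)
    uxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    uyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    uxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return 0.5 * uxx + rho * uxy + 0.5 * uyy
```

The residual at an interior point such as `(2.0, 3.0)` should be zero up to discretization error, and `u_const` takes the boundary values 0 at $x = 0$ and 1 at $y = 0$.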


Proof of Theorem 7. $u(z)$ as in Theorem 3 satisfies the ODE:

\[ \mu(z)u'(z) + \nu(z)u''(z) = 0, \tag{A.13} \]

now with the boundary conditions $u(0) = H$ and $u(1) = 1 - H$, where

\[
\begin{aligned}
\mu(z) &= -2(1-z)\sigma^b(z)^2 + 2(2z-1)\rho(z)\sigma^b(z)\sigma^a(z) + 2z\sigma^a(z)^2, \\
\nu(z) &= (1-z)^2\sigma^b(z)^2 - 2z(1-z)\rho(z)\sigma^b(z)\sigma^a(z) + z^2\sigma^a(z)^2.
\end{aligned}
\]

As in the proof of Theorem 3, this ODE has a solution of the form

\[ u(z) = C_1\int_0^z e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy + C_2, \tag{A.14} \]

where $C_1$, $C_2$ are two constants to be determined. By using the boundary conditions
$u(0) = H$ and $u(1) = 1 - H$, we conclude that

\[ P_{up}(x,y) = P_{up}(z) = u(z) = H + (1 - 2H)\,\frac{\int_0^z e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}{\int_0^1 e^{-\int_0^y \frac{\mu(x)}{\nu(x)}\,dx}\,dy}. \tag{A.15} \]

□
Appendix B. Tables


Table 3. Summary of Volume Changes (NASDAQ)

Imbalance   BAC b   BAC a   GE b    GE a    GM b    GM a    JPM b   JPM a
0.0-0.1     0.349   0.499   0.367   0.508   0.640   0.462   0.628   0.496
0.1-0.2     0.449   0.519   0.461   0.503   0.508   0.488   0.580   0.487
0.2-0.3     0.458   0.542   0.439   0.513   0.480   0.517   0.498   0.507
0.3-0.4     0.472   0.556   0.478   0.536   0.482   0.532   0.450   0.542
0.4-0.5     0.503   0.537   0.507   0.539   0.508   0.531   0.472   0.529
0.5-0.6     0.533   0.508   0.529   0.518   0.532   0.510   0.514   0.448
0.6-0.7     0.540   0.488   0.537   0.490   0.530   0.489   0.518   0.477
0.7-0.8     0.540   0.467   0.528   0.469   0.514   0.471   0.527   0.514
0.8-0.9     0.522   0.479   0.523   0.472   0.494   0.498   0.516   0.527
0.9-1.0     0.486   0.310   0.496   0.378   0.463   0.643   0.508   0.696


Table 4. Summary of Volume Changes (NYSE)

Imbalance   BAC b   BAC a   GE b    GE a    GM b    GM a    JPM b   JPM a
0.0-0.1     0.561   0.540   0.586   0.596   0.730   0.598   0.723   0.602
0.1-0.2     0.524   0.531   0.502   0.552   0.557   0.506   0.532   0.531
0.2-0.3     0.514   0.529   0.490   0.531   0.502   0.497   0.491   0.512
0.3-0.4     0.518   0.541   0.505   0.522   0.494   0.497   0.480   0.511
0.4-0.5     0.517   0.553   0.518   0.529   0.485   0.486   0.492   0.507
0.5-0.6     0.539   0.511   0.539   0.518   0.501   0.491   0.510   0.494
0.6-0.7     0.551   0.506   0.528   0.513   0.496   0.496   0.499   0.485
0.7-0.8     0.534   0.516   0.541   0.503   0.498   0.503   0.506   0.491
0.8-0.9     0.518   0.535   0.553   0.512   0.510   0.550   0.527   0.539
0.9-1.0     0.554   0.582   0.575   0.551   0.586   0.718   0.598   0.742
Table 5. Summary of Correlation of the Best Bid and Ask

Imbalance: 0.00-0.05 0.05-0.10 0.10-0.15 0.15-0.20 0.20-0.25 0.25-0.30 0.30-0.35 0.35-0.40 0.40-0.45 0.45-0.50
           0.50-0.55 0.55-0.60 0.60-0.65 0.65-0.70 0.70-0.75 0.75-0.80 0.80-0.85 0.85-0.90 0.90-0.95 0.95-1.00

BAC(T):  … -0.07 …
BAC(N):  …
GE(T):   … -0.07 … -0.07 …
GE(N):   … -0.10 … -0.18 -0.16 -0.17 … -0.19 … -0.22 … -0.17 -0.18 -0.15 -0.16 -0.16 -0.15 …
GM(T):   … -0.17 …
GM(N):   -0.07 … -0.27 … -0.27 …
JPM(T):  … -0.17 -0.17 … -0.24 … -0.27 … -0.24 …
JPM(N):  … -0.29 -0.29 -0.34 -0.34 -0.32 -0.34 -0.33 -0.33 -0.34 -0.26 -0.29 -0.28 … -0.29 -0.26 … -0.17 -0.06
Table 6. Summary of Empirical Probability (E) and Model Prediction (P) for Stocks Traded on NASDAQ

Imbalance: 0.00-0.05 0.05-0.10 0.10-0.15 0.15-0.20 0.20-0.25 0.25-0.30 0.30-0.35 0.35-0.40 0.40-0.45 0.45-0.50
           0.50-0.55 0.55-0.60 0.60-0.65 0.65-0.70 0.70-0.75 0.75-0.80 0.80-0.85 0.85-0.90 0.90-0.95 0.95-1.00

BAC(E):  … 0.139 0.136 0.159 0.215 0.270 0.307 0.351 0.426 0.441 0.492 0.539 0.628 0.658 … 0.737 0.783 … 0.960
BAC(P):  0.072 0.131 … 0.224 0.263 0.301 0.337 0.374 0.410 0.447 … 0.523 0.562 0.603 0.646 0.693 0.750 0.825 0.912 …
GE(E):   0.053 0.127 0.168 0.186 0.221 0.231 … 0.364 0.397 0.441 … 0.545 0.600 0.653 0.669 0.749 0.762 0.799 …
GE(P):   … 0.144 0.186 0.229 0.273 0.319 0.364 0.409 0.454 … 0.542 0.586 0.631 0.676 0.722 0.766 0.811 0.855 0.896 0.933
GM(E):   0.102 0.197 0.245 0.267 0.306 0.342 0.367 0.412 0.463 0.482 0.503 0.533 0.570 0.604 0.644 0.679 … 0.740 0.810 0.902
GM(P):   0.176 0.211 0.247 … 0.322 0.359 0.394 0.429 0.463 0.497 0.531 0.566 0.602 0.639 0.677 0.716 0.754 0.791 0.825 0.856
JPM(E):  0.077 0.137 0.225 0.254 0.285 … 0.378 … 0.540 0.591 … 0.639 0.645 0.668 0.722 0.791 0.799 …
JPM(P):  0.158 0.196 0.235 0.275 0.315 … 0.396 0.436 0.475 0.514 … 0.591 0.630 0.668 0.706 0.744 … 0.815 …
Table 7. Summary of Empirical Probability (E) and Model Prediction (P) for Stocks Traded on NYSE

Imbalance: 0.00-0.05 0.05-0.10 0.10-0.15 0.15-0.20 0.20-0.25 0.25-0.30 0.30-0.35 0.35-0.40 0.40-0.45 0.45-0.50
           0.50-0.55 0.55-0.60 0.60-0.65 0.65-0.70 0.70-0.75 0.75-0.80 0.80-0.85 0.85-0.90 0.90-0.95 0.95-1.00

BAC(E):  0.128 0.194 0.249 0.249 0.276 … 0.377 0.405 … 0.442 0.468 0.490 0.530 0.590 … 0.649 0.723 0.793 … 0.924
BAC(P):  … 0.216 0.243 0.272 0.303 … 0.375 0.415 0.458 0.502 0.546 0.588 0.628 0.665 0.700 0.731 0.760 0.787 0.811 0.833
GE(E):   … 0.237 0.332 … 0.347 … 0.395 0.435 0.435 0.476 0.520 … 0.580 0.603 0.619 0.647 0.660 … 0.860
GE(P):   0.251 0.271 0.292 0.315 0.340 … 0.396 0.428 0.461 0.495 0.529 0.562 0.593 0.622 0.649 0.675 0.699 0.723 0.745 0.766
GM(E):   0.157 0.221 0.277 0.310 0.341 … 0.393 0.422 0.443 0.454 0.516 0.554 0.572 0.601 … 0.659 0.701 0.729 0.785 0.856
GM(P):   0.217 0.244 0.272 … 0.333 0.365 … 0.432 0.466 0.501 0.536 0.570 0.603 0.636 0.667 0.698 0.727 0.756 …
JPM(E):  0.125 0.174 0.219 0.272 0.313 … 0.428 0.411 … 0.474 0.522 0.569 … 0.637 0.664 0.701 … 0.771 0.812 0.875
JPM(P):  0.179 0.209 0.241 0.275 … 0.385 0.425 0.465 0.506 0.545 0.584 0.623 0.661 0.696 0.730 0.763 0.794 …